DETAILED ACTION
Response to Amendment 

1. In response to the office action from 3/18/2008, the applicant has submitted an
amendment, filed 6/17/2008, amending independent claims 16, 19, 23, and 27, while arguing to 
traverse the art rejection based on the limitations regarding calculating estimated weights, 
marking utterance sections, and using weighted sections for speaker independent-dependent 
model conversion (Amendment, Pages 10-11). Applicant's arguments have been fully
considered; however, the previous rejection is maintained for the reasons listed below in the
response to arguments.

2. In response to the amendment of claims 16, 19, 23, and 27 (Amendment, Page 9), which 
includes that the instructions are executable and executed by a processor to realize the practical 
application functionality of the presently claimed invention, the examiner has withdrawn the
previous 35 U.S.C. 101 rejection. 

Response to Arguments 

3. Applicant's arguments have been fully considered but they are not persuasive for the
following reasons:

With respect to the independent claims (1, 8, 16, and 23), the applicant argues that
Barnard et al (U.S. Patent: 7,216,079) first fails to teach calculated estimated weights for an 
identified error in recognition of utterances based on a reference string because the segment 
alignment data in Barnard only represents boundaries and there is no teaching of a reference 
string (Amendment, Page 10). 

In response, the examiner points out that while Barnard does utilize alignment data in 
detecting speech recognition errors for model training, the alignment data is not what was relied 
upon in the previous Office Action for teaching the "estimated weights". Instead, it is the 
determined shifting amount that is applied to mean vectors of acoustic models for training that 
anticipates this claimed feature (Col. 3, Line 64- Col. 4, Line 11; and Col. 6, Lines 21-39). This 
shifting amount represents a numerical modification to the mean values of a speech recognition 
model, and thus, effectively weights an acoustic model numerically in one direction or another 
(i.e., closer or further) based on a correct/incorrect speech recognition decision (Col. 3, Line 64- 
Col. 4, Line 11). Also, Barnard's comparison involves a cross-comparison involving a correct 
phoneme string sequence that corresponds to a word. It is through comparison with this 
reference string that recognition errors are determined (Col. 6, Lines 21-39). Thus, Barnard does
teach a reference string. Therefore, for at least the above reasons, the applicant's first argument 
has been fully considered, but is not convincing. 
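
The mean-shifting behavior described above can be illustrated with a minimal sketch. The linear update rule, the function name `shift_mean`, and the `shift` parameter are assumptions made for illustration only; they are not taken from Barnard or from the claims.

```python
def shift_mean(mean, observation, shift, correct):
    """Move a model mean vector relative to an observed feature vector.

    Hypothetical rule: a correct decision pulls the mean toward the
    observation, an incorrect decision pushes it away. `shift` is the
    assumed shifting amount, given as a fraction of the distance.
    """
    direction = 1.0 if correct else -1.0
    return [m + direction * shift * (x - m) for m, x in zip(mean, observation)]


closer = shift_mean([0.0, 0.0], [1.0, 2.0], 0.5, correct=True)
further = shift_mean([0.0, 0.0], [1.0, 2.0], 0.5, correct=False)
```

In this sketch the shifting amount acts as a numerical weight on how strongly each recognition decision modifies the model.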

With respect to the independent claims, the applicant secondly argues that Barnard
merely discloses alignment between different segmentations and not marking sections as being 
misrecognized and further alleges that since there are no weights, there can be no associating of 
weights and sections (Amendment, Page 10).

In response, the examiner notes that Barnard does teach marking sections as being
misrecognized. In his comparison, Barnard matches up a second alignment with a correct, 
reference sequence string. As Barnard proceeds through this comparison, wrong phonemes and 
correct phonemes within the second sequence are marked with respect to the reference string 
("correct and wrong", "incorrectly recognized", Col. 6, Lines 21-39). Thus, since Barnard 
sequentially compares a second utterance with a reference to determine those phoneme sections 
that are wrong or "incorrectly recognized", Barnard does anticipate the section marking recited in 
the presently claimed invention. Furthermore, the shifting values or weights in Barnard are 
assigned to the sections based on this marking (Col. 3, Line 64- Col. 4, Line 5). Thus, the 
applicant's second argument has been fully considered, but is not convincing. 
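
The comparison of a recognized phoneme sequence against a reference string can be sketched as follows. This is only an illustration that assumes a generic sequence alignment from Python's standard `difflib` module; the function name `mark_errors` and the sample phonemes are hypothetical, and the alignment method is not the one disclosed in Barnard.

```python
from difflib import SequenceMatcher


def mark_errors(reference, recognized):
    """Return the sections where the recognized sequence departs from
    the reference, as (kind, reference_part, recognized_part) tuples."""
    marks = []
    matcher = SequenceMatcher(a=reference, b=recognized, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only the misrecognized sections
            marks.append((tag, reference[i1:i2], recognized[j1:j2]))
    return marks


marks = mark_errors(["k", "ae", "t"], ["k", "ah", "t"])
```

Each marked section could then be paired with a shifting value or weight, in the manner the response above attributes to Barnard.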

Finally, with respect to the independent claims, the applicant argues that Barnard only
teaches moving incorrect phonemes away from a mean value and does not teach using weighted 
utterance sections to convert a speaker independent model to a speaker dependent model 
(Amendment, Pages 10-11). 

In response, the examiner notes that Barnard's process is directed to training a speech
recognition model (Col. 3, Line 64- Col. 4, Line 11). The data used to train/modify the initial
system model is based on utterances received from a particular application user (Col. 6, Lines
40-46). An initial, untrained correct acoustic model in Barnard would not involve any received
user speech data, but is trained over time according to speech received from a user. In this way,
the training process of Barnard begins with a non-user based acoustic model and ends in a user-
trained or speaker dependent model (i.e., it converts SI to SD). Thus, for at least this reason, the
applicant's last argument has been fully considered, but is not convincing.

The applicant's arguments with respect to Claim 5 (Amendment, Page 13) are identical to
those presented in the response filed on 12/26/2007. In the subsequent Office Action from 
3/18/2008 (Pages 3-4), the examiner explained why the two equations are equivalent and that 
motivation was provided for including the teachings of Junqua (U.S. Patent: 6,253,181). Since 
these arguments are similar to those presented previously and since the applicant has not 
specifically addressed the corresponding examiner response, please see Pages 3-4 of the Office 
Action from 3/18/2008 in regards to these arguments. 

The art rejections of the respective dependent claims are traversed for reasons similar to
the independent claims (i.e., 1, 8, 16, and 23) (Amendment, Page 14). In regards to such
arguments see the above response directed towards the independent claims. 

Claim Rejections - 35 USC § 102 

4. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the
basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(e) the invention was described in (1) an application for patent, published under section 122(b), by another filed 
in the United States before the invention by the applicant for patent or (2) a patent granted on an application for 
patent by another filed in the United States before the invention by the applicant for patent, except that an 
international application filed under the treaty defined in section 351(a) shall have the effects for purposes of this 
subsection of an application filed in the United States only if the international application designated the United 
States and was published under Article 21(2) of such treaty in the English language.

5. Claims 1-3, 7-11, 15-18, 22-26, and 30 are rejected under 35 U.S.C. 102(e) as being 
anticipated by Barnard et al (U.S. Patent: 7,216,079). 

With respect to Claim 1, Barnard discloses:

Calculating estimated weights for identified errors in recognition of utterances based on a
reference string (using reference strings to identify incorrectly recognized utterance sections and 
determining associated training weights, Col. 3, Line 64- Col. 4, Line 11; Col. 5, Lines 16-26; 
Col. 6, Lines 21-39; and Col. 9, Lines 47-67; and Fig. 3); 

Marking sections of the utterances as being misrecognized and associating the estimated 
weights with the sections of the utterances (utterance segment locations that are incorrectly 
recognized are selected and associated with a training weight shift value, Col. 6, Lines 21-39; 
Col. 8, Lines 51-60; Col. 9, Lines 47-67; and Fig. 3); 

Using the weighted sections of the utterances to convert a speaker independent model to a 
speaker dependent model (weighted utterance segments are used to gradually train an initial
model for a particular speaker, Col. 3, Line 64- Col. 4, Line 11; and Col. 6, Lines 40-61).

With respect to Claim 2, Barnard further discloses: 

The method steps (a)-(c) are repeated at least once (repeated processing is performed,
Col. 3, Line 64- Col. 4, Line 11; and Col. 6, Lines 40-46).

With respect to Claim 3, Barnard further discloses:

The utterances are converted into a recognized phone string a first time through applying
the speaker independent model and thereafter through applying the most recently obtained
speaker dependent model (recognizer creates phoneme strings using an initial model that is
gradually/repeatedly trained, Col. 3, Line 64- Col. 4, Line 11; and Col. 6, Lines 40-46; and Fig.
3).

With respect to Claim 7, Barnard further discloses:

Different misrecognized words have different weights (variable weighting value, Col. 9,
Lines 47-67). 

With respect to Claim 8, Barnard discloses: 

Recognizing utterances through converting the utterances into a recognized string (speech 
recognition generates a phoneme string, Col. 7, Lines 13-46); 

Comparing the recognized string with a reference string to determine errors (location of 
errors is determined by comparing correct reference string and recognized string, Col. 8, Lines 
51-60); 

Calculating estimated weights for sections of the utterances (using reference strings to 
identify incorrectly recognized utterance sections and determining associated training weights,
Col. 3, Line 64- Col. 4, Line 11; Col. 5, Lines 16-26; Col. 6, Lines 21-39; and Col. 9, Lines 47- 
67; and Fig. 3); 

Marking the errors in the utterances and providing corresponding estimated weights to 
form adaptation enrollment data (utterance segment locations that are incorrectly recognized are 
selected and associated with a training weight shift value, Col. 6, Lines 21-39; Col. 8, Lines 51-
60; Col. 9, Lines 47-67; and Fig. 3); and 

Using the adaptation enrollment data to convert a speaker independent model to a speaker
dependent model (weighted utterance segments are used to gradually train an initial model for a
particular speaker, Col. 3, Line 64- Col. 4, Line 11; and Col. 6, Lines 40-61).

With respect to Claim 9, Barnard further discloses: 
The utterances are converted into the recognized string through applying the speaker
independent model (initial recognition model that is to be gradually adapted, Col. 3, Line 64-
Col. 4, Line 15). 

With respect to Claim 10, Barnard further discloses: 

Parts (b)-(e) are repeated until differences between the reference and recognized strings 
are less than a threshold (corrective action is only taken until a difference greater than a 
closeness threshold (i.e., below an effective threshold measure of similarity) is reached, Col. 9, 
Lines 11-26; and Col. 10, Lines 14-22). 

Claim 11 contains subject matter similar to Claim 3, and thus, is rejected for the same 
reasons. 

Claim 15 contains subject matter similar to Claim 7, and thus, is rejected for the same 
reasons. 

With respect to Claim 16, Barnard discloses the method for marking and weighting 
misrecognized utterance sections for speaker training as applied to claim 1, implemented as a 
computer readable medium storing a program executable by a computer (Col. 11, Line 49- Col. 
12, Line 18). 

Claims 17-18 contain subject matter respectively similar to Claims 2-3, and thus, are
rejected for the same reasons.

Claim 22 contains subject matter similar to Claim 7, and thus, is rejected for the same 
reasons. 

With respect to Claim 23, Barnard discloses the method for marking and weighting
misrecognized utterance sections for speaker training as applied to claim 8, implemented as a
computer readable medium storing a program executable by a computer (Col. 11, Line 49- Col.
12, Line 18). 

Claims 24-26 contain subject matter respectively similar to Claims 9-11, and thus, are 
rejected for the same reasons. 

Claim 30 contains subject matter similar to Claim 7, and thus, is rejected for the same 
reasons. 

Claim Rejections - 35 USC § 103 

6. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

7. Claims 5, 13, 20, and 28 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Barnard et al in view of Junqua (U.S. Patent: 6,253,181). 

With respect to Claims 5, 13, 20, and 28, Barnard discloses the method for marking and 
weighting misrecognized utterance sections for speaker training, as applied to Claims 1, 8, 16,
and 23. Barnard does not specifically disclose the calculation of a weighting score that
computes an average likelihood difference per frame; however, Junqua discloses a calculation of
a likelihood difference used in determining a speaker adaptation that utilizes an average of 
likelihood difference scores associated with an incorrect recognition (Col. 4, Lines 9-24; and 
Col. 5, Lines 15-67). Junqua further discloses an equation similar to that recited in claim 5 for
determining a log-likelihood difference in a speaker adaptation process that utilizes an average of
likelihood scores (Col. 5, Lines 15-67; and Col. 4, Lines 9-24). 

Barnard and Junqua are analogous art because they are from a similar field of endeavor in 
speaker adaptation systems. Thus, it would have been obvious to a person of ordinary skill in the 
art, at the time of invention, to modify the teachings of Barnard with the likelihood difference 
calculation taught by Junqua in order to implement a high speed speaker adaptation system that 
is capable of providing a measure of recognition reliability (Junqua, Col. 3, Lines 29-31; and 
Col. 4, Lines 9-24).

Allowable Subject Matter 

8. Claims 4, 12, 19, and 27 are allowable over the prior art of record. 

9. The following is an examiner's statement of reasons for allowance: 

With respect to Claims 4, 12, 19, and 27, the prior art of record fails to explicitly teach 
or fairly suggest a method or computer readable medium storing a program executed by a 
computer for speaker adaptation that utilizes estimated weights based on misrecognized speech 
utterances as respectively recited in claims 4 and 12, wherein the estimated weights are 
calculated by computing an average likelihood difference per frame and then computing a weight 
value by averaging the average likelihood difference over error words (specification, page 6). 

Although Barnard et al (U.S. Patent: 7,216,079) discloses that it is well known in the 
prior art to mark and weight misrecognized utterance sections for speaker training (Col. 3, Line 
64- Col. 4, Line 11; Col. 5, Lines 16-26; Col. 6, Lines 21-39; and Col. 9, Lines 47-67; and Fig.
3) and Junqua (U.S. Patent: 6,253,181) teaches an equation for calculating an average 
likelihood difference, as applied to claim 5, Junqua does not teach averaging the average 
likelihood difference over all error words to determine a weight for speaker adaptation of a 
speech recognition model. Thus, claims 4 and 12 are allowable over the prior art of record. 
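
The two-stage averaging described in the reasons for allowance can be sketched as follows. The equations recited in the claims are not reproduced in this Office action, so the sign convention (recognized score minus reference score), the input layout, and the function names are all assumptions made for illustration only.

```python
def per_frame_difference(ref_loglik, rec_loglik, num_frames):
    """Average log-likelihood difference per frame for one error word.

    Assumed convention: score of the recognized (wrong) hypothesis minus
    score of the reference, divided by the word's frame count.
    """
    return (rec_loglik - ref_loglik) / num_frames


def estimated_weight(error_words):
    """Average the per-frame differences over all error words.

    `error_words` is a hypothetical list of
    (ref_loglik, rec_loglik, num_frames) tuples, one per error word.
    """
    diffs = [per_frame_difference(*word) for word in error_words]
    return sum(diffs) / len(diffs)


weight = estimated_weight([(-10.0, -4.0, 3), (-9.0, -3.0, 2)])
```

With these sample values the per-frame differences are 2.0 and 3.0, so the resulting weight is 2.5.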

Any comments considered necessary by applicant must be submitted no later than the 
payment of the issue fee and, to avoid processing delays, should preferably accompany the issue 
fee. Such submissions should be clearly labeled "Comments on Statement of Reasons for
Allowance." 

10. Claims 6, 14, 21, and 29 are objected to as being dependent upon a rejected base claim, 
but would be allowable if rewritten in independent form including all of the limitations of the 
base claim and any intervening claims. 

11. The following is a statement of reasons for the indication of allowable subject matter:

With respect to Claims 6, 14, 21, and 29, the prior art of record fails to explicitly teach
or fairly suggest a method for speaker adaptation that utilizes estimated weights based on
misrecognized speech utterances, wherein the estimated weights are calculated by multiplying an 
average likelihood difference per frame calculated using the equation recited in claims 5, 13, 20, 
and 28 by the inverse of a number of misrecognized words for a particular speaker as per the 
equation recited in claims 6, 14, 21, and 29.

Although Barnard et al (U.S. Patent: 7,216,079) discloses that it is well known in the
prior art to mark and weight misrecognized utterance sections for speaker training (Col. 3, Line 
64- Col. 4, Line 11; Col. 5, Lines 16-26; Col. 6, Lines 21-39; and Col. 9, Lines 47-67; and Fig. 
3) and Junqua (U.S. Patent: 6,253,181) teaches an equation for calculating an average 
likelihood difference, Junqua does not teach multiplying the calculated average likelihood by the 
inverse of a number of misrecognized words for a particular speaker as per the equation recited 
in claims 6, 14, 21, and 29. 

Conclusion 

12. The prior art made of record and not relied upon is considered pertinent to applicant's 
disclosure: See PTO-892.

13. Any inquiry concerning this communication or earlier communications from the
examiner should be directed to James S. Wozniak whose telephone number is (571) 272-7632. 
The examiner can normally be reached on M-Th, 7:30-5:00, F, 7:30-4, Off Alternate Fridays.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Patrick Edouard can be reached at (571) 272-7603. The fax phone number for the 
organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 



/James S. Wozniak/ 

Patent Examiner, Art Unit 2626 



